Interpretese vs. Translationese: The Uniqueness of Human Strategies in Simultaneous Interpretation

Authors

  • He He
  • Jordan L. Boyd-Graber
  • Hal Daumé
Abstract

Computational approaches to simultaneous interpretation are stymied by how little we know about the tactics human interpreters use. We produce a parallel corpus of translated and simultaneously interpreted text and study differences between them through a computational approach. Our analysis reveals that human interpreters regularly apply several effective tactics to reduce translation latency, including sentence segmentation and passivization. In addition to these unique, clever strategies, we show that limited human memory also causes other idiosyncratic properties of human interpretation such as generalization and omission of source content.

1 Human Simultaneous Interpretation

Although simultaneous interpretation has a key role in today’s international community, it remains underexplored within machine translation (MT). One key challenge is to achieve a good quality/speed tradeoff: deciding when, what, and how to translate. In this study, we take a data-driven, comparative approach and examine: (i) What distinguishes simultaneously interpreted text (Interpretese) from batch-translated text (Translationese)? (ii) What strategies do human interpreters use?

Unlike consecutive interpretation (where speakers stop after a complete thought and wait for the interpreter), simultaneous interpretation requires the interpreter to translate while listening to speakers. Language produced in the process of translation is often considered a dialect of the target language: “Translationese” (Baker, 1993). Thus, “Interpretese” refers to interpreted language.

Most previous work focuses on qualitative analysis (Bendazzoli and Sandrelli, 2005; Camayd-Freixas, 2011; Shimizu et al., 2014) or pattern counting (Tohyama and Matsubara, 2006; Sridhar et al., 2013). In contrast, we use a more systematic approach based on feature selection and statistical tests.
In addition, most work ignores translated text, making it hard to isolate strategies applied by interpreters as opposed to general strategies needed for any translation. Shimizu et al. (2014) are the first to take a comparative approach; however, they directly train MT systems on the interpretation corpus without explicitly examining interpretation tactics. While some techniques can be learned implicitly, the model may also learn undesirable behavior such as omission and simplification: byproducts of limited human working memory (Section 4).

Prior work studies simultaneous interpretation of Japanese↔English (Tohyama and Matsubara, 2006; Shimizu et al., 2014) and Spanish↔English (Sridhar et al., 2013). We focus on Japanese↔English interpretation. Since information required by the target English sentence often comes late in the source Japanese sentence (e.g., the verb, the noun being modified), we expect this language pair to reveal a richer set of tactics. (The tactics are consistent with those discovered on other language pairs in prior work, with additional ones specific to interpreting from head-final into head-initial languages.)

Our contributions are three-fold. First, we collect new human translations for an existing simultaneous interpretation corpus, which can benefit future comparative research (https://github.com/hhexiy/interpretese). Second, we use classification and feature selection methods to examine linguistic characteristics comparatively. Third, we categorize human interpretation strategies, including word reordering tactics and summarization tactics. Our results help linguists understand simultaneous interpretation and help computer scientists build better automatic interpretation systems.

2 Distinguishing Translationese and Interpretese

In this section, we discuss strategies used in Interpretese, which we detect automatically in the next section.
Our hypothesis is that tactics used by interpreters roughly fall into two non-exclusive categories: (i) delay minimization, to enable prompt translation by arranging target words in an order similar to the source; (ii) memory footprint minimization, to avoid overloading working memory by reducing communicated information.

Segmentation. Interpreters often break source sentences into multiple smaller sentences (Camayd-Freixas, 2011; Shimizu et al., 2013), a process we call segmentation. This is different from what is commonly used in speech translation systems (Fujita et al., 2013; Oda et al., 2014), where translations of segments are directly concatenated. Instead, humans try to incorporate new information into the preceding partial translation, e.g., using “which is” to put it in a clause (Table 1, Example 3), or creating a new sentence joined by conjunctions (Table 1, Example 5).

Passivization. Passivization is useful for interpreting from head-final languages (e.g., Japanese, German) to head-initial languages (e.g., English, French) (He et al., 2015). Because the verb is needed early in the target sentence but only appears at the end of the source sentence, an obvious strategy is to wait for the final verb. However, if the interpreter uses passive voice, they can start translating immediately and append the verb at the end (Table 1, Examples 4–5). During passivization, the subject is often omitted when obvious from context.

Figure 1: A word cloud visualization of Interpretese (black) and Translationese (gold).

Generalization. Camayd-Freixas (2011) and Al-Khanji et al. (2000) observe that interpreters focus on delivering the gist of a sentence rather than duplicating the nuanced meaning of each word. More frequent words are chosen as their retrieval time is faster (Dell and O’Seaghdha, 1992; Cuetos et al., 2006) (e.g., “honorific” versus “polite” in Table 1, Example 1). Although Volansky et al.
(2013) show that generalization happens in translation too, it is likely more frequent in Interpretese given the severe time constraints.

Summarization. Faced with overwhelming information, interpreters need efficient ways to encode meaning. Less important words, or even a whole sentence, can be dropped, especially when the interpreter falls behind the speaker. In Table 1, Example 2, the literal translation “as much as possible” is reduced to “very”, and the adjective “Japanese” is omitted.

Before we study these characteristics quantitatively in the next section, we visualize Interpretese and Translationese by a word cloud in Figure 1. The size of each word is proportional to the difference between its frequencies in Interpretese and Translationese (Section 3). The word color indicates whether it is more frequent in Interpretese (black) or Translationese (gold). “the” is over-represented in Interpretese, a phenomenon that also occurs in Translationese vs. the original text (Eetemadi and Toutanova, 2014). More conjunction words (e.g., “and”, “so”, “or”, “then”) are used in Interpretese, likely for segmentation, whereas “that” is more frequent in Translationese, a sign of clauses. In addition, the pronoun “I” occurs more often in Translationese while “be” and “is” occur more often in Interpretese, which is consistent with our passivization hypothesis.

Source (S), translation (T), and interpretation (I) text, with the tactics used in each example

Example 1. Tactics: generalize, segment 〈∥〉, (omit)
(S) この日本語の待遇表現の特徴ですが英語から日本語へ直訳しただけでは表現できないといった特徴があります.
(T) (One of) the characteristics of honorific Japanese is that it can not be adequately expressed when using a direct translation (from English to Japanese).
(I) Now let me talk about the characteristic of the Japanese polite expressions. 〈∥〉 And such such expressions can not be expressed enough just by translating directly.

Example 2. Tactics: generalize, summarize, (omit)
(S) で三番目の特徴としてはですねえ出来る限り自然な日本語の話言葉とてその出力をするといったような特徴があります.
(T) Its third characteristic is that its output is, as much as possible, in the natural language of spoken (Japanese).
(I) And the third feature is that the translation could be produced in a very natural spoken language.

Example 3. Tactics: segment 〈∥〉, (omit)
(S) まとめますと我々は派生文法という従来の学校文法とは違う文法を使った日本語解析を行っています.その結果従来よりも単純な解析が可能となっております.
(T) In sum, we’ve conducted an analysis on the Japanese language, using a grammar different from school grammar, called derivational grammar. (As a result,) we were able to produce a simpler analysis (than the conventional method).
(I) So, we are doing Japanese analysis based on derivational grammar, 〈∥〉 which is different from school grammar, 〈∥〉 which enables us to analyze in simple way.

Example 4. Tactics: generalize, passivize, segment 〈∥〉
(S) つまり例えばこの表現一は認識できますが二から四は認識できない.
(T) They might recognize expression one but not expressions two to four.
(I) The phrase number one only is accepted 〈∥〉 and phrases two, three, four were not accepted.

Example 5. Tactics: generalize, passivize, segment 〈∥〉
(S) 以上のお話をまとめますと自然な発話というものを扱うことができる音声対話の方法ということを考案しました.
(T) In summary, we have devised a way for voice interaction systems to handle natural speech.
(I) And this is the summary of what I have so far stated. The spontaneous speech can be dealt with by the speech dialog method 〈∥〉 and that method was proposed.

Table 1: Examples of tactics used by interpreters to cope with divergent word orders, limited working memory, and the pressure to produce low-latency translations. We show the source input (S), translated sentences (T), and interpreted sentences (I). The tactics are listed with each example and marked in the text: more general translations are highlighted in italics; 〈∥〉 marks where new clauses or sentences are created; and passivized verbs in translation are underlined. Information appearing in translation but omitted in interpretation is in (parentheses).
Summarized expressions and their corresponding expressions in translation are underlined by wavy lines.

3 Classification of Translationese and Interpretese

We investigate the difference between Translationese and Interpretese by creating a text classifier to distinguish between them and then examining the most useful features. We train our classifier on a bilingual Japanese-English corpus of spoken monologues and their simultaneous interpretations (Matsubara et al., 2002).

To obtain a three-way parallel corpus of aligned translation, interpretation, and their shared source text, we first align the interpreted sentences to source sentences by dynamic programming following Ma (2006). (Sentences are defined by sentence boundaries marked in the corpus, thus coherence is preserved during alignment.) This step results in 1684 pairs of text chunks, with 33 tokens per chunk on average. We then collect human translations from Gengo (http://gengo.com, “standard” quality) for each source text chunk (one translator per monologue). The original corpus has four interpreters per monologue. We use all available interpretations by copying the translation of a text chunk for each of its additional interpretations.

3.1 Discriminative Features

We use logistic regression as our classifier. Its job is to tell, given a chunk of English text, whether translation or interpretation produced it. We add ℓ1 regularization to select the non-zero features that best distinguish Interpretese from Translationese. We experiment with three different sets of features: (1) POS: n-gram features of POS tags (up to trigram); (2) LEX: word unigrams; (3) LING: features reflecting linguistic hypotheses (Section 2), most of which are counts of indicator functions normalized by length of the chunk (Appendix A).

The top linguistic features listed in Table 3 are consistent with our hypotheses.
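The setup in Section 3.1 (length-normalized count features fed to an ℓ1-regularized logistic regression) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the word lists in `CONJUNCTIONS` and `DEMONSTRATIVES`, the two-feature vector, and the choice of proximal gradient descent as the optimizer are not taken from the paper, whose actual feature definitions are in its Appendix A and whose solver is not stated.

```python
import math

# Hypothetical word lists for illustration only; the paper defines its
# LING features in Appendix A, which is not reproduced here.
CONJUNCTIONS = {"and", "or", "so", "then", "but"}
DEMONSTRATIVES = {"this", "that", "these", "those"}


def ling_features(chunk):
    """Return indicator counts normalized by chunk length (in tokens)."""
    tokens = chunk.lower().split()
    n = max(len(tokens), 1)  # guard against empty chunks
    return [
        sum(t in CONJUNCTIONS for t in tokens) / n,
        sum(t in DEMONSTRATIVES for t in tokens) / n,
    ]


def soft_threshold(value, threshold):
    """Shrink value toward zero; values within the threshold become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def train_l1_logreg(X, y, lam=0.01, lr=0.5, epochs=500):
    """Fit weights w and bias b by proximal gradient descent.

    Labels use 1 for Interpretese and 0 for Translationese (an arbitrary
    convention chosen for this sketch).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for j in range(d):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        # Gradient step on the log-loss, then the L1 proximal (shrinkage) step.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b  # the bias is left unregularized
    return w, b
```

The shrinkage step is what makes feature selection possible: a weight whose gradient never outweighs the penalty stays at exactly zero, so only the surviving non-zero weights need to be inspected. In practice one would pass `[ling_features(c) for c in chunks]` as `X` and vary `lam` to control how many features survive.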
The most prominent ones, which are also revealed by POS and LEX, are the segmentation features, including counts of conjunction words (CC), content words (nouns, verbs, adjectives, and adverbs) that appear more than once (repeated), demonstratives (demo) such as this, that, these, those, segmented sentences (sent), and proper nouns (NNP). More conjunction words and more sentences in a text chunk are signs of segmentation. Repeated words and the frequent use of demonstratives come from transforming clauses to independent sentences.

Next are the passivization features, indicating more passivized verbs (passive) and fewer pronouns (pronoun) in Interpretese. The lack of pronouns may result from either subject omission during passivization or general omission.

The last group is the vocabulary features, showing fewer stem types, token types, and content words in Interpretese, evidence of word generalization. In addition, a smaller number of content words suggests that interpreters may use more function words to manipulate the sentence structure.

3.2 Classification Results

Recall that our goal is to understand Interpretese, not to classify Interpretese and Translationese; however, the ten-fold cross-validation accuracies of LING, POS, and LEX are 0.66, 0.85, and 0.94, respectively. LEX and POS yield high accuracy because some features overfit, e.g., in this dataset, most interpreters used “parsing” for “構文解析” while the translator used “syntactic analysis”. Therefore, they do not reveal much about the characteristics of Interpretese except for frequent use of “and” and CC, which indicates segmentation. Similarly, Volansky et al. (2013) and Eetemadi and Toutanova (2014) also find lexical features very effective but not generalizable for detecting Translationese and exclude them from analysis. One reason for the relatively low accuracy of LING may be inconsistent

We prepend 〈S〉 and append 〈E〉 to all sentences.

LING | POS | LEX
CC + | 〈S〉 CC + | And +
repeated + | . CC + | parsing +
demo + | 〈S〉 CC IN + | gradual –
sent + | NN CC PR + | syntax –
passive + | 〈S〉 CC DT + | keyboard +
pronoun – | CC RB DT + | attitudinal –
NNP + | , RB DT + | text –
stem type – | . CC DT + | adhoc +
tok type – | NN FW NN + | construction –
content – | NN CC RB – | Furthermore –

Table 3: Top 10 highest-weighted features in each model. The sign shows whether it is indicative of Interpretese (+) or Transla-




Publication date: 2016